Abstract

We prove generic versions of the no-cloning and no-broadcasting theorems, applicable to essentially
any non-classical finite-dimensional probabilistic model that satisfies a no-signaling criterion. This 
includes quantum theory as well as models supporting "super-quantum" correlations that violate the 
Bell inequalities to a larger extent than quantum theory. The proof of our no-broadcasting theorem is 
significantly more natural and more self-contained than others we have seen: we show that a set of states 
is broadcastable if, and only if, it is contained in a simplex whose vertices are cloneable, and therefore 
distinguishable by a single measurement. This necessary and sufficient condition generalizes the quantum 
requirement that a broadcastable set of states commute.

As part of this development several authors, notably
Barrett [9] and Spekkens [51], have recently
taken up the question of how far the information- 
theoretic novelties presented by quantum mechanics 
are in fact generic in other types of probabilistic the- 
ory. Spekkens constructs an ingenious "toy model" in 
which a limitation on the amount of knowledge avail- 
able to observers is sufficient to yield, among many 
other things, a no-cloning property. Working in a 
framework in which essentially any finite-dimensional 
compact convex set counts as a state space, Barrett 
shows that universal probabilistic cloning is impossi- 
ble in any non-classical finite-dimensional probabilis- 
tic theory. 

A major motivation for Barrett's work was to come 
up with a reasonable physical framework in which 
arbitrary nonsignaling correlations may be obtained 
from measurements on a bipartite system. Such cor- 
relations can be more non-local than quantum the- 
ory allows, and include the super-quantum corre- 
lations that have come to be known as Popescu- 
Rohrlich (PR) boxes, or Non-Local Machines [33, 45, 
10, 58, 47, 15, 14, 11, 32, 53, 54, 13]. His frame- 
work is based on that of Hardy, and in developing 
it Barrett and Hardy have essentially reinvented the 
finite-dimensional version of a much older framework 
for generalized probabilistic models, based on con- 
vex sets [42, 37, 38, 39, 40, 41, 21, 12, 25, 30], which 
grew out of attempts to axiomatize quantum theory 
within the quantum logic tradition, and which we 
adopt here. 

Popescu and Rohrlich [45] originally raised the 
question of why nature does not allow super-quantum 
correlations, given that they would not violate relativistic
causality. In this regard, it is important to
distinguish the unique features of quantum theory 
from those that would still hold in theories permitting 
more general correlations. Placing PR boxes within 
a framework that also includes quantum theory and 
classical probability theory as special cases, helps to 
understand the common features that these have 
been found to exhibit (see [9] for a discussion of many 
of these). 

In this paper, we completely characterize the sets 
of states that can be cloned or broadcast in any finite- 
dimensional probabilistic model within the convex 



sets framework, obtaining along the way a simple, 
natural, and self-contained proof of the quantum no- 
broadcasting theorem that is substantially simpler 
than the original proof of Barnum, Caves, Fuchs, 
Jozsa, and Schumacher [5], and substantially more 
intuitive and self-contained than that based on Lind- 
blad's Theorem [36] (which, however, provided some 
suggestive ideas). 

In section 2, we sketch the standard framework 
for generalized probability theory, in which arbitrary 
compact convex sets are construed as state-spaces. 
We restrict our attention, in the main, to finite- 
dimensional state spaces. In this context, a state 
space is classical iff it is a simplex. In section 3, 
we discuss the maximal, or injective, tensor prod- 
uct of convex sets, pointing out along the way some 
familiar aspects of entanglement (e.g., entanglement 
monogamy) that hold generically for all non-classical 
models. In section 4, we prove our generic no-cloning
theorem. We show that the set of states cloned by 
an affine mapping must be distinguishable from one 
another, with certainty, by a single observable. It fol- 
lows that only when the state-space is a simplex is it 
possible to clone all pure states. 

In section 5, we show that the set of states broadcast
by an affine mapping is contained in a possibly
larger set of states, the extreme points of which are 
cloned by an affine map. It follows that the extreme 
points of this larger set are distinguishable. In fact 
we show that a set of states is broadcastable if, and 
only if, it is contained in a simplex whose vertices are 
jointly distinguishable. In the quantum-mechanical 
setting, convex combinations of distinguishable states 
commute, so we obtain the quantum no-broadcasting 
theorem as a corollary. Finally, we extend this result 
to show that for any affine map, the set of states it 
broadcasts is a (possibly empty) simplex whose ver- 
tices are distinguishable states. To prove this, we use 
an extension of the classical Perron-Frobenius theory
for (possibly reducible) non-negative real square ma- 
trices. The necessary technical apparatus is collected 
in an appendix.

2 The Framework

To survey all possible probabilistic theories requires 

some altitude. That is, one needs to work in a 
mathematical framework that imposes only the most 
minimal constraints on the structure of probabilistic 
models. Such a framework was constructed, for 
just this purpose, by Mackey [42] in the late 1950s; 
refinements and stylistic variants of this can be 
found in the work of many other authors, including 
Ludwig [37, 38, 39, 40, 41], Foulis and Randall 
[21], Beltrametti and Bugajski [12], Gudder et al.
[25], and Holevo [30]. The framework developed 
by Hardy [27, 28] (see also [43]) for an axiomatic 
derivation of quantum mechanics is essentially a 
finite-dimensional version. What follows is simply 
a sketch of this common, more or less canonical, 
framework. 

States and Effects 

We assume that a physical system is characterized
by its state-space $\Omega$, which we take to be convex. We
write $A(\Omega)$ for the space of all affine linear functionals
$f : \Omega \to \mathbb{R}$, and $A(\Omega)_+$ for the space of all nonnegative
affine functionals $f : \Omega \to \mathbb{R}_+$. Note that $A(\Omega)$
is an ordered linear space, with $f \le g$ iff $f(\omega) \le g(\omega)$
for all $\omega \in \Omega$. The order unit of $A(\Omega)$ is the unit
functional $u$ given by $u(\omega) = 1$ for all $\omega \in \Omega$; the
unit interval in $A(\Omega)$ is the set $[0, u]$ consisting of all
functionals $a \in A(\Omega)$ satisfying $0 \le a \le u$ (in the
pointwise ordering on $\Omega$).

We interpret each $a \in [0, u]$ as representing an "effect"
- that is, some possible event or occurrence associated
with the system - and $a(\omega)$ as the probability
of this occurrence when the system is in state $\omega$.
There is a natural embedding of $\Omega$ in $A(\Omega)^*$, given by
$\omega \mapsto \hat{\omega}$, where $\hat{\omega}(a) = a(\omega)$ for all $a \in A(\Omega)$.
Henceforth, we identify $\omega$ with $\hat{\omega}$, writing $\omega(a)$ in place of
$a(\omega)$, as this is in better keeping with the idea of
states assigning probabilities to effects (rather than
effects assigning expected values to states).

We write $V(\Omega)$ for the span of $\Omega$ in $A(\Omega)^*$. The
space $V$ is ordered by the cone $V_+$ consisting
of all $\mu \in V$ with $\mu(a) \ge 0$ for every $a \in A(\Omega)_+$.
Equivalently, $\mu \in V_+$ iff $\mu$ is a non-negative multiple
of a state $\omega \in \Omega$. Accordingly, we call elements of
$V(\Omega)$ weights. We say that $\Omega$ is finite-dimensional
iff $V(\Omega)$ is finite-dimensional, and compact iff $\Omega$ is
compact in the weakest topology making evaluation
at each $a \in [0, u]$ continuous. For the remainder of
this paper, we make the standing assumption that
all state spaces are finite-dimensional and compact
(equivalently, closed) as subsets of $A(\Omega)^*$. This
guarantees that $\Omega$ is the closed convex hull of its
extreme points, which are referred to as pure states.

Examples 

In constructing examples, one often begins with a
test space (or manual) [21, 34, 35]: that is, a collection
$\mathfrak{A}$ of (not necessarily disjoint) sets $E, F, \ldots$,
called tests, interpreted as the outcome-sets of various
measurements. Let $X = \bigcup \mathfrak{A}$ be the set of all
outcomes of all tests $E \in \mathfrak{A}$. A state on $\mathfrak{A}$ is defined
to be a mapping $\omega : X \to [0, 1]$ summing independently
to 1 over each $E \in \mathfrak{A}$. The collection $\Omega(\mathfrak{A})$ of
all such states is obviously convex. If each $E \in \mathfrak{A}$ is
finite, then it is also compact in the topology of pointwise
convergence on $X$ [55]. A state is deterministic
(dispersion-free) iff its value on each outcome $x \in X$
is either 0 or 1.

(a) If $\mathfrak{A}$ consists of a single test $E$, with a finite
number of outcomes, then $\Omega(\mathfrak{A})$ is the set of all classical
probability distributions over $E$. This is a simplex,
which we denote by $\Delta(E)$.

(b) If $\mathfrak{A}$ consists of two two-outcome tests $E_0 =
\{a_{00}, a_{01}\}$ and $E_1 = \{a_{10}, a_{11}\}$, then $\Omega(\mathfrak{A})$ is a square.
The index $i$ in $a_{ij}$ can be thought of as the "input",
corresponding to the choice of measurement to be
performed on the system, and the index $j$ can be
thought of as a binary "output". Then, the states
$\omega \in \Omega(\mathfrak{A})$ can be thought of as conditional probability
distributions (or equivalently $2 \times 2$ stochastic
matrices) where $p(\text{output} = j \mid \text{input} = i) = \omega(a_{ij})$,
and any conditional probability distribution likewise
defines a valid state. The four vertices of the state
space are the four deterministic states corresponding
to the choice of a definite output for each possible
input. Clearly, this construction can be repeated
for a test space with any number of nonoverlapping
tests, the resulting state space being an appropriate
set of conditional probability distributions. Such test
spaces are often called "semi-classical" test spaces in
the quantum logic literature [56, 20].

(c) If $\mathfrak{A}$ is the collection of all maximal orthonormal
subsets (i.e. orthonormal bases) of a Hilbert space
$H$ of dimension at least 3, then $\Omega(\mathfrak{A})$ is canonically
isomorphic to the convex set of density operators on
$H$, by Gleason's Theorem.

(d) An interesting model, well known in the quantum
logic literature [56, 20], consists of three three-outcome
tests $E = \{a, x, b\}$, $F = \{b, y, c\}$ and $G =
\{c, z, a\}$, pasted together in a loop. The extreme
points of $\Omega(\mathfrak{A})$ are the four dispersion-free states with
supports $\{a, y\}$, $\{b, z\}$, $\{c, x\}$ and $\{x, y, z\}$, plus the
non-dispersion-free state giving $a$, $b$, $c$ all probability
1/2 and $x$, $y$ and $z$ probability 0.

(e) For another example, let $\mathfrak{A}$ consist of the rows
and columns of a $3 \times 3$ array: then $\Omega(\mathfrak{A})$ is the
convex set of doubly-stochastic $3 \times 3$ matrices, which
is not a simplex - in spite of the fact that the pure
states, corresponding to permutation matrices, are
deterministic.
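As an illustration of example (d), the following Python sketch checks that the five states listed there do sum to 1 over each of the three tests, and that the state with value 1/2 on a, b, c cannot be a mixture of the four dispersion-free states. The function names and the dictionary encoding of states are assumptions made for this sketch only.

```python
from fractions import Fraction

# The three tests of the loop test space of example (d).
TESTS = [("a", "x", "b"), ("b", "y", "c"), ("c", "z", "a")]
OUTCOMES = sorted({o for test in TESTS for o in test})

def state_from_support(support, value=Fraction(1)):
    """Weights equal to `value` on the outcomes in `support`, 0 elsewhere."""
    return {o: (value if o in support else Fraction(0)) for o in OUTCOMES}

def is_state(weights):
    """True iff the weights are non-negative and sum to 1 over every test."""
    if any(weights[o] < 0 for o in OUTCOMES):
        return False
    return all(sum(weights[o] for o in test) == 1 for test in TESTS)

def is_deterministic(weights):
    """True iff every outcome has probability 0 or 1 (dispersion-free)."""
    return all(weights[o] in (0, 1) for o in OUTCOMES)

DISPERSION_FREE = [state_from_support(s) for s in ("ay", "bz", "cx", "xyz")]
HALF = state_from_support("abc", Fraction(1, 2))

def excluded_from_mixtures_of(target, components):
    """True iff each component puts positive weight on some outcome where
    `target` is zero, so no component can enter a mixture equal to target."""
    return all(
        any(c[o] > 0 and target[o] == 0 for o in OUTCOMES) for c in components
    )

print(all(is_state(s) for s in DISPERSION_FREE), is_state(HALF))
print(excluded_from_mixtures_of(HALF, DISPERSION_FREE))
```

Each dispersion-free state assigns probability 1 to at least one of x, y, z, where the half-valued state assigns 0, which is why the latter is not a convex combination of the former.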

Observables 

By a (discrete) observable on a system with state-space
$\Omega$, we mean a function $F : x \mapsto F_x$ from a finite
set $E$ into $A(\Omega)$, satisfying (i) $F_x \ge 0$ for all $x \in E$,
and (ii) $\sum_{x \in E} F_x = u$. Any state $\omega \in \Omega$ pulls back along
$F$ to a probability weight $p \in \Delta(E)$ via $p(x) = F_x(\omega)$.
This provides a dual map $F^* : \Omega \to \Delta(E)$ defined as
$F^*(\omega) = p$. Note that this definition of an observable
generalizes the notion of a Positive Operator Valued
Measure (POVM) in quantum theory, rather than the
more specialized notion of an observable associated
with a self-adjoint operator.

A special case of an observable is a list $(a_1, \ldots, a_k)$
of positive elements of $A(\Omega)$ that sums to $u$ (in this
case, the mapping $F : \{1, \ldots, k\} \to [0, u]$ taking $i$ to
$a_i$ is implicit). Most of the observables considered
below will be of this type.

An observable $F$ is said to be informationally complete,
or IC, if and only if the set of functionals
$\{F_x \mid x \in E\}$ separates states, i.e., if $F_x(\omega) = F_x(\mu)$
for all $x \in E$ implies $\omega = \mu$ for all states $\omega, \mu \in \Omega$.
(This is equivalent to saying that the dual mapping
$F^* : \Omega \to \Delta(E)$ is an affine injection.) Note that $F$ is
IC if and only if $\{F_x \mid x \in E\}$ spans $A(\Omega)$. If this set is
a basis for $A(\Omega)$, we shall say the observable is minimally
IC. The following result is not new (see [49]
for an infinite-dimensional version), but we include a
proof for completeness.

Lemma 1 Any finite-dimensional state space supports
a minimal informationally complete observable.

Proof: It suffices to produce a sequence $(a_1, \ldots, a_n)$
of vectors $a_i \in [0, u]$, with $n = \dim(A(\Omega))$ distinct
entries, summing to the order unit $u$. Let
$B = (b_1, \ldots, b_n)$ be any basis for $A(\Omega)$. Without loss
of generality, suppose that $\sum_i b_i = ku$, a multiple of
the order unit. (If not, apply a suitable invertible
linear transformation.) Let $c$ be the minimum over $i$ of
$\inf\{b_i(\omega) \mid \omega \in \Omega\}$. Then $b_i - cu$ is positive. Now,
$\sum_i (b_i - cu) = (k - nc)u$, with $k - nc > 0$. Hence,
if $a_i = (b_i - cu)/(k - nc)$, we have $a_i \ge 0$ and
$\sum_i a_i = u$. Obviously, $\{a_i \mid i = 1, \ldots, n\}$ spans $A(\Omega)$,
so $(a_1, \ldots, a_n)$ is a minimal IC observable. □
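The construction in this proof can be traced on the square state space of example (b). The sketch below is an illustration under assumptions that are not in the text: states are coordinatized as points (x, y) of the unit square, an affine functional c0 + c1 x + c2 y is stored as the triple (c0, c1, c2), and the starting basis {x, y, 1 - x - y} is chosen so that it already sums to the order unit (so k = 1).

```python
from fractions import Fraction as F
from itertools import product

# Pure states of the square, coordinatized as corners of the unit square.
VERTICES = list(product((0, 1), repeat=2))

def evaluate(f, state):
    """Value of the affine functional f = (c0, c1, c2) at the state (x, y)."""
    x, y = state
    return f[0] + f[1] * x + f[2] * y

def minimal_ic_observable(basis):
    """Shift and rescale a basis summing to k*u into effects summing to u."""
    n = len(basis)
    total = tuple(sum(b[j] for b in basis) for j in range(3))
    assert total[1] == 0 and total[2] == 0, "basis must sum to a multiple of u"
    k = total[0]
    # An affine functional on a polytope attains its infimum at a vertex.
    c = min(evaluate(b, v) for b in basis for v in VERTICES)
    scale = k - n * c
    assert scale > 0
    # a_i = (b_i - c*u) / (k - n*c); subtracting c*u only changes c0.
    return [((b[0] - c) / scale, b[1] / scale, b[2] / scale) for b in basis]

def determinant3(rows):
    """Determinant of a 3x3 matrix, used to confirm linear independence."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

BASIS = [(F(0), F(1), F(0)), (F(0), F(0), F(1)), (F(1), F(-1), F(-1))]
EFFECTS = minimal_ic_observable(BASIS)
print(EFFECTS)
```

Here c = -1 and k - nc = 4, so the resulting effects are (1 + x)/4, (1 + y)/4 and (2 - x - y)/4; they are non-negative on the square, sum to the unit functional, and are linearly independent.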

Operations 

Any physically performable operation on a system 
should respect probabilistic mixtures of states, and 
hence, should be representable by an affine mapping
$\phi : \Omega \to \Omega'$, where $\Omega$ is the state space of the system
prior to the operation being performed, and $\Omega'$
is the post-operation state space. Generally, the set
of allowed operations in a given model could be a 
strict subset of the set of all affine maps. This should 
be familiar from the quantum case given in exam- 
ple (c), since in that case the affine maps are the 
set of all positive, trace-preserving, linear maps on 
operators, whereas quantum operations are usually 
taken to be completely positive. Since we are con- 
cerned with proving restrictions on the set of op- 
erations available in any model, we assume that all 
affine maps represent possible operations, but the re- 
strictions obviously still apply to any subset of these 
maps. 

Lemma 2 Let $E = (a_1, \ldots, a_n)$ be any observable on
$\Omega$, and let $\delta_1, \ldots, \delta_n \in \Omega'$ be any states in $\Omega'$. Then
the mapping $\phi : \Omega \to \Omega'$ given by

$$\phi(\omega) = \sum_i \omega(a_i)\, \delta_i$$

for all $\omega \in \Omega$ is affine, i.e., an operation.

The proof is routine. Physically, such a process
could be implemented by measuring $E$ and then
preparing the indicated state.

Notice that any operation $\kappa : \Omega \to \Omega'$ determines a
dual linear transformation $\kappa^* : A(\Omega') \to A(\Omega)$ given
by $\kappa^*(f)(\omega) = f(\kappa(\omega))$ for all effects $f \in A(\Omega')$ and
all states $\omega \in \Omega$. This mapping preserves positivity
and the order unit, and hence allows us to pull
observables on $\Omega'$ back to observables on $\Omega$. (In this
connection, notice also that if $\kappa$ is injective, $\kappa^*$ will
pull informationally complete observables on $\Omega'$ back
to informationally complete observables on $\Omega$.)



3 Tensor Products 

Given two systems with state spaces $\Omega$ and $\Omega'$,
we'd like to construct a state space to represent a 
coupled system with these as components. There is 
in general no unique way to do this, but rather there 
is a spectrum of candidates, bounded by a maximal 
and a minimal tensor product. 

Definition: The maximal tensor product of two state
spaces $\Omega$ and $\Omega'$, which we'll denote by $\Omega \otimes \Omega'$, is the
set of all bilinear functionals $\mu : A(\Omega) \times A(\Omega') \to \mathbb{R}$
that are (i) positive on pairs $(a, b)$ with $a, b \ge 0$, and
(ii) normalized by $\mu(u, u') = 1$ (where $u$ and $u'$ are
the order-units of $A(\Omega)$ and $A(\Omega')$, respectively).

One can show that the maximal tensor product cor- 
responds to the largest set of joint probability assign- 
ments to measurements on the two component sys- 
tems, subject to a "no-signaling" condition [9, 21, 35]. 

Given states $\alpha \in \Omega$ and $\beta \in \Omega'$, one has a product
state $\alpha \otimes \beta$ given by $(\alpha \otimes \beta)(a, b) = \alpha(a)\beta(b)$
for all $(a, b) \in A(\Omega) \times A(\Omega')$.



Definition: The minimal tensor product of $\Omega$ and
$\Omega'$ is the convex hull of the set of product states
in $\Omega \otimes \Omega'$. We term such a convex combination a
separable state, and accordingly denote the minimal
tensor product by $\Omega \otimes_{\mathrm{sep}} \Omega'$. A non-separable state
in $\Omega \otimes \Omega'$ will be termed entangled.

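In the present finite-dimensional setting,
$V(\Omega \otimes \Omega') = V(\Omega) \otimes V(\Omega')$ and $A(\Omega \otimes \Omega') =
A(\Omega) \otimes A(\Omega')$ [31, 35, 40, 41, 55]. It follows that
$\Omega \otimes \Omega'$ and $\Omega \otimes_{\mathrm{sep}} \Omega'$ have the same affine dimension.
Hence, every state in $\Omega \otimes \Omega'$ can be expressed as an
affine combination $\sum_i t_i\, \alpha_i \otimes \beta_i$, where $\sum_i t_i = 1$,
but the $t_i$ need not be positive.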

Examples 

(a) If $\Omega$ and $\Omega'$ are both classical state spaces, say
$\Omega = \Delta(E)$ and $\Omega' = \Delta(E')$, then $\Omega \otimes \Omega' = \Omega \otimes_{\mathrm{sep}} \Omega'$,
both being isomorphic to $\Delta(E \times E')$.

(b) If $\Omega$ and $\Omega'$ are the state spaces associated
with the semiclassical binary-input, binary-output
test space discussed in example (b) in section 2, then
$\Omega \otimes \Omega'$ supports all bipartite nonsignaling correlations
obtainable with two binary inputs and two binary
outputs. The extreme points are the local deterministic
states specifying a definite output for each input,
and states supporting nonlocal PR-box type correlations.
On the other hand, $\Omega \otimes_{\mathrm{sep}} \Omega'$ only contains local
states from which no Bell-inequality violations can be
obtained. More generally, for any pair of semiclassical
test spaces, $\Omega \otimes \Omega'$ supports all bipartite nonsignaling
correlations with the appropriate cardinality of
inputs and outputs, whereas $\Omega \otimes_{\mathrm{sep}} \Omega'$ contains only
local states from which no Bell-inequality violations
can be obtained.
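To make the PR-box correlations of example (b) concrete, the following Python sketch uses the standard PR-box distribution, in which the outputs a, b satisfy a XOR b = x AND y for inputs x, y, each consistent pair occurring with probability 1/2. This explicit form and the CHSH combination are the usual textbook conventions rather than something stated in the text above, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

BITS = (0, 1)

def pr_box(a, b, x, y):
    """p(a, b | x, y) for a PR box: outputs satisfy a XOR b = x AND y."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

def local_deterministic(a, b, x, y):
    """A local deterministic state: both parties always output 0."""
    return Fraction(1) if (a, b) == (0, 0) else Fraction(0)

def is_nonsignaling(p):
    """Each party's marginal must not depend on the other party's input."""
    for a, x in product(BITS, repeat=2):
        if sum(p(a, b, x, 0) for b in BITS) != sum(p(a, b, x, 1) for b in BITS):
            return False
    for b, y in product(BITS, repeat=2):
        if sum(p(a, b, 0, y) for a in BITS) != sum(p(a, b, 1, y) for a in BITS):
            return False
    return True

def correlator(p, x, y):
    """Expectation of (-1)^(a XOR b) for inputs x, y."""
    return sum((-1) ** (a ^ b) * p(a, b, x, y) for a, b in product(BITS, repeat=2))

def chsh(p):
    """CHSH combination; local states give at most 2 in absolute value."""
    return (correlator(p, 0, 0) + correlator(p, 0, 1)
            + correlator(p, 1, 0) - correlator(p, 1, 1))

print(is_nonsignaling(pr_box), chsh(pr_box), chsh(local_deterministic))
```

The PR box is nonsignaling yet reaches the algebraic maximum of the CHSH expression, which exceeds both the local bound and the quantum bound.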

(c) If $\Omega$ and $\Omega'$ are the usual state spaces associated
with complex Hilbert spaces $H$ and $H'$, then
$\Omega \otimes \Omega'$ is properly larger than the usual quantum
state space associated with $H \otimes H'$ [21, 35, 34, 6];
the minimal tensor product, consisting of separable
states, is properly smaller.

Henceforth, by a tensor product for state spaces
$\Omega$ and $\Omega'$, we'll simply mean some convex set
containing $\Omega \otimes_{\mathrm{sep}} \Omega'$ and contained in $\Omega \otimes \Omega'$.

Remark: Given affine mappings $\phi : \Omega \to \Gamma$ and $\phi' :
\Omega' \to \Gamma'$, there is a unique affine mapping

$$\phi \otimes \phi' : \Omega \otimes \Omega' \to \Gamma \otimes \Gamma'$$

satisfying $(\phi \otimes \phi')(\alpha \otimes \beta) = \phi(\alpha) \otimes \phi'(\beta)$ for all $\alpha \in \Omega$
and all $\beta \in \Omega'$. In particular, there is no notion
of "complete positivity" for either the minimal or
maximal tensor product. That is, the tensor product
of any two positive linear mappings remains positive
with respect to either the maximal or the minimal
tensor cone.

Marginal and Conditional States 

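A state $\omega \in \Omega \otimes \Omega'$ has well-defined marginal states
$\omega_1 \in \Omega$ and $\omega_2 \in \Omega'$ given, respectively, by

$$a(\omega_1) = (a \otimes u')(\omega) \quad\text{and}\quad b(\omega_2) = (u \otimes b)(\omega)$$

for all effects $a \in [0, u]$, $b \in [0, u']$. This fact allows
us to define conditional states $\omega_{2|a}$ and $\omega_{1|b}$ by

$$\omega_{2|a}(b) = \frac{\omega(a, b)}{\omega_1(a)} \quad\text{and}\quad \omega_{1|b}(a) = \frac{\omega(a, b)}{\omega_2(b)}.$$

We have the expected identities

$$\omega(a, b) = \omega_1(a)\, \omega_{2|a}(b) = \omega_{1|b}(a)\, \omega_2(b).$$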

The following observation is familiar in the setting 
of both classical and quantum probability theory: 

Lemma 3 If either marginal, $\omega_1$ or $\omega_2$, of a bipartite
state $\omega$ in $\Omega \otimes \Omega'$ is pure (i.e. extremal), then $\omega =
\omega_1 \otimes \omega_2$.

Proof: Suppose $\omega_2$ is pure. We wish to show that
$\omega(a, b) = \omega_1(a)\omega_2(b)$ for all effects $a \in [0, u]$, $b \in [0, u']$. Let
$E \subseteq [0, u]$ be any observable. Then we have

$$\omega_2 = \sum_{a \in E} \omega_1(a)\, \omega_{2|a}.$$

This gives us $\omega_2$ as a convex combination of the
states $\omega_{2|a}$ with coefficients $\omega_1(a)$. As $\omega_2$ is pure, we
have for each $a \in E$ either $\omega_1(a) = 0$ or $\omega_{2|a} = \omega_2$;
in either case, $\omega(a, b) = \omega_1(a)\omega_2(b)$ for all $b \in [0, u']$.
Since $E$ was chosen arbitrarily, this holds also for all
$a \in [0, u]$. □

The tensor product construction can be iterated -
we can form

$$\Omega^{\otimes n} := \underbrace{\Omega \otimes \cdots \otimes \Omega}_{n \text{ times}}.$$

Applying Lemma 3 to this setting, we see that the
"monogamy of entanglement" [52] is an entirely
generic phenomenon. Thus, for instance, if $\omega$ is a
tripartite state in $\Omega_1 \otimes \Omega_2 \otimes \Omega_3$, then we can form
various marginals, e.g., $\omega_{12} \in \Omega_1 \otimes \Omega_2$, etc. If $\omega_{12}$ is
a pure entangled state, then $\omega = \omega_{12} \otimes \omega_3$ - whence
$\omega_{23} = \omega_2 \otimes \omega_3$ and $\omega_{13} = \omega_1 \otimes \omega_3$.

Remarks: 

(1) In the context of abstract convex sets, the max- 
imal tensor product (more usually called the injective 
tensor product) seems first to have been discussed by 
Namioka and Phelps [44]; see also Wittstock [57] for 
a survey. As a model for coupled physical systems, it 
was discussed (implicitly) by Foulis and Randall [21], 
Klay, Randall and Foulis [35], and Klay [34]. (See
also [6] and [55]). 

(2) The definition of an entangled state as a state
not contained in $\Omega \otimes_{\mathrm{sep}} \Omega'$ naturally generalizes the
quantum definition. A pure state is entangled iff it
has a mixed marginal, and a mixed state is entangled
if it cannot be written as a convex combination of
pure product states. (See [7, 8] for an even more
broadly applicable generalization of this definition of
entanglement to convex operational settings.) With
this definition, it is easy to see from Lemma 3 that
any tensor product properly larger than the minimal
one contains entangled states.

4 Cloning 

A deterministic cloning procedure for a state $\alpha \in \Omega$
involves preparing the system in state $\alpha$, preparing
a second copy of the system in a particular state $\beta$,
and performing an operation on the combined system
that takes the initial state $\alpha \otimes \beta$ to the final state
$\alpha \otimes \alpha$. Since the initial ancillary state $\beta$ is supposed
to be fixed, we can equally well regard such a procedure
as an affine mapping $\kappa : \Omega \to \Omega \otimes \Omega$ such that
$\kappa(\alpha) = \alpha \otimes \alpha$. One can also consider probabilistic
cloning, in which there is a non-zero probability that
the cloning procedure will simply fail (but we will
know if it does). Barrett has shown in [9] that universal
probabilistic cloning is generically impossible
in (finite-dimensional) non-classical theories. Here,
we consider only deterministic cloning, and accordingly
drop the adjective.

Our aim is to show that a set of states that is simultaneously
cloneable must also be sharply distinguishable
from one another by a single observable, and vice
versa. Our proof of this is essentially just crystallized
folklore: cloning allows us to produce large ensem- 
bles of independent copies of each cloneable state; 
performing the same measurement on each of these 
defines an observable on the original system, which 
distinguishes among the cloned states to arbitrary ac- 
curacy, by the law of large numbers. Conversely, if a 
set of states is sharply distinguishable then they may 
be cloned by measuring the distinguishing observable 
and then preparing another copy of the correspond- 
ing state. 

This observation has already been made in the 
quantum case (see [16] for example) and it has also 
been noted that the argument does not seem to depend
on the details of quantum mechanics, which is
confirmed by the present result. However, the argu- 
ment need not be true in all conceivable frameworks 
for physical theories, as it depends on the idea that 
any state can be reliably prepared and that distinct 
states are separated by some measurement. This is 
true in the present framework, but theories in which 
the notion of state includes "hidden variables" pro- 
vide counterexamples to this. As a rather extreme ex- 
ample, consider a theory just like the ones described 
here, except that the state of each system is supplemented
by a hidden bit that can have value 0 or 1,
but which has absolutely no effect on measurement 
outcomes. Suppose further that any operation from a 
single system to a bipartite composite system copies 
the value of the hidden bit to both output systems. 
In such a world, we can clone states just as well as 
in the present framework, but nevertheless we cannot 



distinguish between two states that have differing val- 
ues of the hidden bit. 

In the present framework, the existence of a 
cloning procedure will depend not only on the struc- 
ture of the convex set of states, but also on what 
kinds of affine mappings one admits as "physical" 
operations. Indeed, the constant mapping that takes 
every state in $\Omega$ to the state $\alpha \otimes \alpha$ is affine; thus,
on a liberal understanding of physical operations, in 
which any affine mapping between state spaces is 
physically realizable, every state - mixed as well as 
pure - is (deterministically) cloneable if we do not 
demand that the same map clone more than this one 
state. 

Definitions: Call a finite collection $\alpha_1, \ldots, \alpha_n$ of
states

(a) co-cloneable iff there exists a single cloning map
$\kappa : \Omega \to \Omega \otimes \Omega$ that clones them all, i.e., $\kappa(\alpha_i) =
\alpha_i \otimes \alpha_i$ for every $i = 1, \ldots, n$, and

(b) jointly distinguishable iff there exists an observable
$E = (a_0, \ldots, a_n)$ with $a_i(\alpha_j) = \delta_{ij}$. In this
case, we say that the $\alpha_i$ are distinguishable by
$E$, or that $E$ is distinguishing for $\alpha_1, \ldots, \alpha_n$.

In discrete classical probability theory, any finite
collection of pure states is jointly distinguishable. It
is important to note that, in general, a pairwise- 
distinguishable set of states will not be jointly dis- 
tinguishable. Indeed, in the case of a binary input, 
binary output, semiclassical test space (see Exam- 
ple (b) of section 2), any two extreme states are dis- 
tinguishable by one of the two tests, but no observ- 
able will sharply distinguish between any three pure 
states. (See also the remark following Corollary 1 
below.) 

In finite-dimensional quantum probability theory, 
the pure states corresponding to two vectors v and 
w are distinguishable in the foregoing sense iff the 
vectors v and w are orthogonal. More generally, we 
have the following 

Lemma 4 Quantum states $\rho$ and $\rho'$ are distinguishable
iff the corresponding density operators satisfy
$\rho\rho' = \rho'\rho = 0$.

Proof: $\rho$ and $\rho'$ are distinguishable iff there exists
a self-adjoint operator $0 \le A \le 1$ with $\mathrm{Tr}(A\rho) = 1$
and $\mathrm{Tr}(A\rho') = 0$. Let $\rho = \sum_i t_i P_i$ where the $P_i$ are
rank-one projections associated with unit vectors
$v_i$, and where the convex coefficients $t_i$ are all
non-zero. If $\mathrm{Tr}(A\rho) = 1$, then $\sum_i t_i \langle A v_i, v_i \rangle = 1$.
Since $0 \le A \le 1$, $0 \le \langle A v_i, v_i \rangle \le 1$, so we must
have $\langle A v_i, v_i \rangle = 1$ for each $i$. In other words, each
$v_i$ belongs to the eigenspace of $A$ corresponding to
eigenvalue 1. By the same argument, if $\rho' = \sum_j s_j Q_j$
is a convex combination of rank-one projections $Q_j$
(with $s_j > 0$ for all $j$), the vectors in the range
of $Q_j$ must belong to the 0-eigenspace of $A$. Accordingly,
$P_i \perp Q_j$ for every $i$ and every $j$, so that
$\rho\rho' = \rho'\rho = 0$. □

An easy extension of this argument shows that a
set of quantum states is jointly distinguishable iff all
pairs $\rho$, $\rho'$ (with $\rho \ne \rho'$) of corresponding density
operators satisfy $\rho\rho' = 0$. That is, a pairwise distinguishable
set of quantum states is jointly distinguishable.
As noted above, this is not generally the
case. This is one of many respects in which quantum 
probabilistic models are relatively well-behaved. 

Theorem 1 In any finite-dimensional probabilistic
theory, using any tensor product, distinct states are 
co-cloneable iff they are jointly distinguishable. 

In outline, the proof is simply the observa- 
tion that, to distinguish among the states to any 
given accuracy, it suffices to produce, by iterated 
cloning, a sufficiently large ensemble of independent 
copies of each cloneable state, and then to apply 
to each copy any observable on which these states 
have distinct distributions. The details are as follows: 

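Proof: Suppose first that $\alpha_1, \ldots, \alpha_n$ are distinguishable
by $E = (a_0, a_1, \ldots, a_n)$. Define $\kappa : \Omega \to \Omega \otimes \Omega$ by

$$\kappa(\omega) = \sum_{i=0}^{n} a_i(\omega)\, \alpha_i \otimes \alpha_i \qquad (1)$$

where $\alpha_0$ is chosen arbitrarily. As observed in Lemma
2, this mapping is affine; obviously, $\kappa(\alpha_i) = \alpha_i \otimes \alpha_i$
for $i = 1, \ldots, n$.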
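A minimal sketch of the measure-and-prepare map of equation (1) in the simplest classical case may help. The assumptions are ours: the state space is the simplex over three outcomes, states are stored as probability tuples, the distinguishing effects are the coordinate functionals (so no extra effect $a_0$ is needed), and the helper names are illustrative.

```python
from fractions import Fraction

N_OUTCOMES = 3

def pure(i):
    """The pure (point-mass) state on outcome i of a classical simplex."""
    return tuple(Fraction(int(j == i)) for j in range(N_OUTCOMES))

def tensor(alpha, beta):
    """Product state: a joint distribution stored as a tuple of rows."""
    return tuple(tuple(p * q for q in beta) for p in alpha)

def effect(i, omega):
    """Distinguishing effect a_i evaluated on omega (coordinate functional)."""
    return omega[i]

def clone(omega):
    """kappa(omega) = sum_i a_i(omega) * (alpha_i tensor alpha_i), as in (1)."""
    joint = [[Fraction(0)] * N_OUTCOMES for _ in range(N_OUTCOMES)]
    for i in range(N_OUTCOMES):
        copy_pair = tensor(pure(i), pure(i))
        for r in range(N_OUTCOMES):
            for c in range(N_OUTCOMES):
                joint[r][c] += effect(i, omega) * copy_pair[r][c]
    return tuple(tuple(row) for row in joint)

def marginals(joint):
    """First and second marginal of a joint distribution."""
    first = tuple(sum(row) for row in joint)
    second = tuple(sum(joint[r][c] for r in range(N_OUTCOMES))
                   for c in range(N_OUTCOMES))
    return first, second

MIXED = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
print(clone(pure(0)) == tensor(pure(0), pure(0)))
print(clone(MIXED) == tensor(MIXED, MIXED), marginals(clone(MIXED)))
```

The map clones each pure state exactly. On a mixture it produces a perfectly correlated joint state rather than the product state, although both marginals still equal the input.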



For the converse, we use the fact that - regardless
of what tensor product we use! - cloning maps can
be iterated. Let $E \subseteq [0, u]$ be an informationally
complete observable (as afforded by Lemma 1), and
consider the $N$-fold iterated cloning map $\kappa_N : \Omega \to
\Omega^{\otimes 2^N}$, where $N$ is a large positive integer. The set
$E_N := E^{\otimes 2^N}$ is a partition of unity in $A(\Omega^{\otimes 2^N})$. Every
sequence $x = (x_1, \ldots, x_{2^N})$ in $E_N$ determines an
empirical distribution $p_x$ on $E$, given by

$$p_x(a) = \frac{1}{2^N}\, \bigl|\{\, j : x_j = a \,\}\bigr| \quad \text{for } a \in E.$$

For each $i = 1, \ldots, n$, let

$$A_{i,N,\epsilon} = \{\, x \in E_N \mid \|p_x - \alpha_i\|_E < \epsilon \,\},$$

where $\|f\|_E$ denotes the maximum absolute value of a
function $f$ over $E$. By the weak law of large numbers,
if $\alpha_{i,N} := (\alpha_i)^{\otimes 2^N} = \kappa_N(\alpha_i)$, then $\alpha_{i,N}(A_{i,N,\epsilon}) > 1 - \epsilon$
for sufficiently large $N$.

Let $a_{i,N,\epsilon}$ be the unique functional in $[0, u]$ defined
by $a_{i,N,\epsilon}(\omega) = \kappa_N(\omega)(A_{i,N,\epsilon})$ for all $\omega \in \Omega$ (in other
words, the pull-back of the set $A_{i,N,\epsilon}$ along $\kappa_N$). We
then have $\alpha_i(a_{i,N,\epsilon}) > 1 - \epsilon$ for sufficiently large $N$.
Note that, since only finitely many $\alpha_i$ are involved,
we can choose $N$ large enough to make this hold
simultaneously for all $i = 1, \ldots, n$. We claim that, for
sufficiently large $N$ and sufficiently small $\epsilon$, $\{a_{i,N,\epsilon}\}$
is summable in $[0, u]$, hence, extends to a partition of
unity. It is sufficient to show that $A_{i,N,\epsilon} \cap A_{k,N,\epsilon} = \emptyset$
for $i \ne k$. To this end, note that since $E$ is informationally
complete, the distinct states $\alpha_i$ induce distinct
probability distributions on $E$. In particular,
there is some $\delta > 0$ such that $\|\alpha_i - \alpha_k\|_E > \delta$ for all
$i \ne k$. Let $\epsilon < \delta/2$. If $x \in A_{i,N,\epsilon} \cap A_{k,N,\epsilon}$, then

$$\|\alpha_i - p_x\|_E < \epsilon \quad\text{and}\quad \|p_x - \alpha_k\|_E < \epsilon,$$

so $\|\alpha_i - \alpha_k\|_E < 2\epsilon < \delta$ - a contradiction. Thus,
$A_{i,N,\epsilon} \cap A_{k,N,\epsilon} = \emptyset$, as claimed.

Now let $a_{0,N,\epsilon} = \kappa_N^*(E_N \setminus \bigcup_i A_{i,N,\epsilon})$. We now have an
observable $E_{N,\epsilon} = \{a_{i,N,\epsilon} \mid i = 0, 1, \ldots, n\}$, such that
$\alpha_i(a_{i,N,\epsilon}) > 1 - \epsilon$ for each $i$. Since $[0, u]^{n+1}$ is compact,
we can choose from among the $E_{N,\epsilon}$ a convergent
sequence of observables $E_m = (a_{0,m}, \ldots, a_{n,m})$ with
$a_{i,m}(\alpha_i) > 1 - 1/m$ for all $i$. Thus, for each $i$, the
sequence $(a_{i,m})$ converges in $[0, u]$ to an effect $a_i$ with
$a_i(\alpha_i) = 1$. We also have

$$\sum_{i=0}^{n} a_i = \lim_m \sum_{i=0}^{n} a_{i,m} = \lim_m u = u.$$

Thus, $(a_0, \ldots, a_n)$ is a distinguishing observable for
$\alpha_1, \ldots, \alpha_n$, as advertised. □
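The law-of-large-numbers step in the proof can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions of ours: the observable has two outcomes, the state assigns probability p to the first outcome, and the function name is hypothetical. It computes exactly the probability that the empirical frequency over $2^N$ independent copies lies within $\epsilon$ of p.

```python
from fractions import Fraction
from math import comb

def concentration(p, N, eps):
    """Exact probability that the empirical frequency of the first outcome,
    over 2**N independent copies, lies strictly within eps of p."""
    n = 2 ** N
    q = 1 - p
    return sum(
        comb(n, k) * p ** k * q ** (n - k)
        for k in range(n + 1)
        if abs(Fraction(k, n) - p) < eps
    )

P = Fraction(1, 4)
EPS = Fraction(1, 10)
for N in (3, 6, 10):
    print(N, float(concentration(P, N, EPS)))
```

The probability of landing in the $\epsilon$-neighbourhood grows towards 1 as $N$ increases, which is what allows the sets $A_{i,N,\epsilon}$ to capture almost all of the weight of the iterated clones.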

The familiar quantum no-cloning result follows, in 
view of the remarks about orthogonality preceding 
the proof. The following result shows that only clas- 
sical systems - i.e., those the state spaces of which 
are simplices - allow universal deterministic cloning. 

Corollary 1 Suppose that $\alpha_1, \ldots, \alpha_n$ are co-cloneable.
Then the convex hull of $\alpha_1, \ldots, \alpha_n$ in $\Omega$ is a simplex.
Hence, if every finite set of pure (extremal) states in
$\Omega$ is co-cloneable then $\Omega$ is a simplex.

Proof: A simplex is the only finite-dimensional
convex set for which each element has a unique
decomposition into extremal states. Hence,
let $\alpha_1, \ldots, \alpha_n$ be jointly distinguishable states,
and let $\sum_i s_i \alpha_i = \sum_i t_i \alpha_i = \omega \in \Omega$, where
$s_1, \ldots, s_n$ and $t_1, \ldots, t_n$ are convex coefficients. Let
$E = (a_0, a_1, \ldots, a_n)$ be a discriminating observable
for $\alpha_1, \ldots, \alpha_n$. Then $s_i = a_i(\omega) = t_i$. □

Remark: One can certainly construct non-classical 
theories in which any pair of extremal states is 
distinguishable, and hence cloneable. For example, 
consider a semi-classical test space, that is, a 
pairwise disjoint collection of outcome-sets. A pure
state on such a test space amounts to a selection 
of one outcome per test, and any two such states 
are distinguished by any test on which they differ. 
(Single systems in both of the theories GNST and 
GLT considered in [9] are of this form.) 



5 Broadcasting 

We say that a state $\rho \in \Omega$ is broadcast by an affine
mapping $B : \Omega \to \Omega \otimes \Omega$ iff the bipartite state
$B(\rho)$ has marginals equal to $\rho$. The quantum
no-broadcasting result of Barnum et al. [5] tells us that
two quantum states are jointly broadcastable iff,
regarded as density operators, they commute. Our aim
in this section is to obtain a characterization of joint
broadcastability for arbitrary systems.

Let $B : \Omega \to \Omega \otimes \Omega$ be an affine mapping. We
define the marginal mappings $B_1, B_2 : \Omega \to \Omega$ by

$$B_1(\rho)(a) = B(\rho)(a \otimes u) \quad\text{and}\quad B_2(\rho)(b) = B(\rho)(u \otimes b).$$

Definition: We say that $\rho \in \Omega$ is broadcast by $B$
iff $B_1(\rho) = B_2(\rho) = \rho$ - that is, iff $\rho$ is simultaneously
a fixed point of both $B_1$ and $B_2$. Let $\Gamma$ be the set
of all states $\rho \in \Omega$ broadcast by $B$. Note that $\Gamma$ is a
convex subset of $\Omega$. Indeed, it is $\Omega$-affine, meaning it
is the intersection of $\Omega$ with an affine subspace.

Cloning is a special case of broadcasting. Indeed,
for pure states, broadcasting reduces to cloning:
if $\alpha$ is extreme and $B(\alpha)$ has marginals equal to $\alpha$,
then by Lemma 3, $B(\alpha) = \alpha \otimes \alpha$. Thus, by our no-cloning
theorem, there can be no universally broadcasting
map on a non-simplicial state space. On the
other hand, all states in the convex hull of a distinguishable
set of states can be broadcast, simply
by cloning the extreme points. To be explicit, let
$\rho = \sum_i t_i \alpha_i$ be a convex combination of co-cloneable
states $\alpha_1, \ldots, \alpha_n$, and let $E = (a_0, \ldots, a_n)$ be a distinguishing
observable for $\alpha_1, \ldots, \alpha_n$. Then the very map
$K$ used to clone the $\alpha_i$ in the proof of Theorem 1,
namely,

$$K : \omega \mapsto \sum_i \omega(a_i)\, \alpha_i \otimes \alpha_i,$$

applied to $\rho$, yields

$$K(\rho) = \sum_i t_i\, \alpha_i \otimes \alpha_i.$$

Taking the first marginal of this, we have

$$a(K(\rho)_1) = \sum_i t_i\, a(\alpha_i) = a\Big(\sum_i t_i \alpha_i\Big) = a(\rho);$$

similarly, the second marginal is also $\rho$. Thus, $K$ is
broadcasting on $\Delta(\{\alpha_1, \alpha_2, \ldots, \alpha_n\})$.
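
The measure-and-prepare construction can be checked numerically in the classical case. The following is a minimal sketch, assuming the distinguishable states are the vertices of a classical simplex, so that a state is a probability vector and a bipartite state is a joint probability matrix; the function names `measure_and_prepare` and `marginals` are illustrative and do not come from the paper.

```python
import numpy as np

def measure_and_prepare(t):
    """Classical analogue of K: outcome i (probability t[i]) prepares vertex i twice.

    The result is the joint distribution sum_i t[i] * e_i (x) e_i, i.e. diag(t).
    """
    t = np.asarray(t, dtype=float)
    return np.diag(t)

def marginals(joint):
    """Return the first and second marginals of a joint probability matrix."""
    return joint.sum(axis=1), joint.sum(axis=0)

rho = np.array([0.2, 0.5, 0.3])          # a mixed state in the simplex
joint = measure_and_prepare(rho)
first, second = marginals(joint)

print(first, second)                      # both marginals reproduce rho
print(np.allclose(joint, np.outer(rho, rho)))  # False: broadcast, not cloned
```

Both marginals equal the input state, while the joint state is perfectly correlated rather than a product, which is the sense in which a mixed state is broadcast without being cloned.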

In fact, the convexity of the set of states broadcast
by any map $B$ shows that any map that broadcasts
$\Gamma$'s extreme points broadcasts $\Gamma$. If $\Gamma$'s extreme
points are extremal in $\Omega$ then, as mentioned above,
a broadcasting map for $\Gamma$ must clone them, but this
is not so in general. Any map of the form

$$B : \omega \mapsto \sum_i \omega(a_i)\, \rho_i, \tag{2}$$

where $\rho_i$'s marginals are both equal to $\alpha_i$ and $[a_i]$
as usual distinguish the $\alpha_i$, broadcasts $\omega \in \Delta(\{\alpha_i\})$,
even though $\rho_i$ may not be $\alpha_i \otimes \alpha_i$.

If $\Gamma$ is a convex subset of a convex set $\Omega$, then every
affine functional $a \in A(\Omega)$ defines, by restriction, an
affine functional $a_\Gamma$ on $\Gamma$. This gives us a natural
positive linear mapping $a \mapsto a_\Gamma$ from $A(\Omega)$ to $A(\Gamma)$,
taking the order unit $u \in A(\Omega)$ to the order unit $u_\Gamma$
in $A(\Gamma)$. By a compression of a convex set $\Omega$ onto
$\Gamma$, we mean an idempotent affine mapping $P : \Omega \to \Omega$
having range $\Gamma$. The existence of a compression
implies that the natural mapping $A(\Omega) \to A(\Gamma)$ is
surjective.

Lemma 5 Let $A : \Omega \to \Omega$ be any affine mapping
taking $\Omega$ into itself. Then there exists a compression
of $\Omega$ onto the set of fixed points of $A$.

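Proof: For each $n \in \mathbb{N}$, let

$$P_n = \frac{1}{n} \sum_{k=1}^{n} A^k : \Omega \to \Omega.$$

Since $\Omega$ is compact, we may assume (passing to a
subsequence if necessary) that $(P_n)$ converges to a
limiting affine map $P : \Omega \to \Omega$. If $A(\rho) = \rho$, then
clearly $P(\rho) = \rho$; conversely, if $\rho = P(\mu)$ for some
$\mu \in \Omega$, then we have

$$
\begin{aligned}
A(\rho) &= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} A^{k+1}(\mu) \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n+1} A^{k}(\mu) - \lim_{n\to\infty} \frac{1}{n} A(\mu) \\
&= \lim_{n\to\infty} \frac{1}{n} \sum_{k=1}^{n} A^{k}(\mu) - \lim_{n\to\infty} \frac{1}{n} A(\mu) + \lim_{n\to\infty} \frac{1}{n} A^{n+1}(\mu) \\
&= P(\mu) = \rho.
\end{aligned}
$$

Thus, the range of $P$ is exactly the fixed-point set of
$A$, as advertised. Note also that, as $P(\mu)$ is a fixed
point of $A$, we have $P(P(\mu)) = P(\mu)$ for any $\mu$, i.e.,
$P$ is idempotent. □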
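
The averaging argument can be illustrated numerically. The sketch below assumes the simplest setting, a classical simplex on which the affine map $A$ acts as a column-stochastic matrix; the helper name `cesaro_average` is hypothetical. The chosen $A$ is periodic, so its powers do not converge, yet the averages $P_n$ do.

```python
import numpy as np

def cesaro_average(A, n):
    """Return P_n = (1/n) * sum_{k=1}^{n} A^k for a square matrix A."""
    total = np.zeros(A.shape, dtype=float)
    power = np.eye(A.shape[0])
    for _ in range(n):
        power = power @ A      # power is now A^k
        total += power
    return total / n

# Column-stochastic map: swaps vertices 0 and 1, leaves vertex 2 fixed.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])

P = cesaro_average(A, 2000)
print(np.round(P, 3))
print(np.allclose(P @ P, P))   # idempotent: P is a compression
print(np.allclose(A @ P, P))   # every state in the range of P is fixed by A
```

Here the range of `P` consists of the vectors with equal weight on vertices 0 and 1, which are exactly the fixed points of `A`.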



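Lemma 6 Let $P : \Omega \to \Omega$ be a compression of a
convex set $\Omega$ onto a convex subset $\Gamma \subseteq \Omega$. Then (i)
$\Gamma \otimes \Gamma$ can be regarded as a convex subset of $\Omega \otimes \Omega$,
and (ii) the mapping $P \otimes P : \Omega \otimes \Omega \to \Omega \otimes \Omega$ has
range contained in $\Gamma \otimes \Gamma$.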

Proof: We can regard $P$ as a surjective mapping
from $\Omega$ to $\Gamma$. If $a$ is a positive affine functional on $\Gamma$,
then $P^*(a) := a \circ P$ is an extension of $a$ to a positive
affine functional on $\Omega$. Now for every $\omega$ belonging to
$\Gamma \otimes \Gamma$, define a bilinear form $\tilde{\omega} : A(\Omega) \times A(\Omega) \to \mathbb{R}$
by $\tilde{\omega}(a, b) = \omega(a_\Gamma, b_\Gamma)$; this is obviously positive and
normalized, so $\tilde{\omega} \in \Omega \otimes \Omega$. The mapping $\omega \mapsto \tilde{\omega}$ is
clearly affine; it is also injective, by the aforementioned
extension property. Identifying $\omega$ with $\tilde{\omega}$, we
can (and shall) regard $\Gamma \otimes \Gamma$ as a convex subset of
$\Omega \otimes \Omega$. It now follows (see the remark at the bottom
of page 5) that $P \otimes P : \Omega \otimes \Omega \to \Gamma \otimes \Gamma$ is a well
defined affine mapping; composing this with the injection
$\omega \mapsto \tilde{\omega}$, we have that $P \otimes P$ takes $\Omega \otimes \Omega$ into
itself, with range contained in $\Gamma \otimes \Gamma$. □

Theorem 2 Let $\Gamma$ be the set of states broadcast by an
affine mapping $B : \Omega \to \Omega \otimes \Omega$. Then $\Gamma$ is contained
in the simplex generated by a set of distinguishable
states in $\Omega$.

Proof: Let $\sigma : \Omega \otimes \Omega \to \Omega \otimes \Omega$ be the affine isomorphism
that interchanges the two factors. Given
the broadcasting map $B : \Omega \to \Omega \otimes \Omega$, define another
affine mapping $B' : \Omega \to \Omega \otimes \Omega$ by $B' = (B + \sigma \circ B)/2$.
Note that $B'$ broadcasts every state $\rho \in \Gamma$. Call
a state $\rho \in \Omega$ symmetrically broadcastable iff it is
broadcast by $B'$, and denote by $\Gamma'$ the set of all such
states. As just observed, $\Gamma \subseteq \Gamma'$.
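
As a small illustration of the symmetrization $B' = (B + \sigma \circ B)/2$, here is a sketch in the classical case, assuming a bipartite state is a joint probability matrix and the swap $\sigma$ is matrix transposition; the names `symmetrize` and `marginals` are illustrative only.

```python
import numpy as np

def symmetrize(joint):
    """Average a joint distribution with its swapped version (classical sigma)."""
    return 0.5 * (joint + joint.T)

def marginals(joint):
    """Return the first and second marginals of a joint probability matrix."""
    return joint.sum(axis=1), joint.sum(axis=0)

# The product state alpha_1 (x) alpha_2 on a two-vertex simplex.
joint = np.outer([1.0, 0.0], [0.0, 1.0])
sym = symmetrize(joint)

first, second = marginals(sym)
print(first, second)   # the two marginals of a symmetrized state coincide
```

After symmetrization the two marginals always agree, which is why a single marginal map suffices to characterize the symmetrically broadcastable states.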

Observe that $\rho \in \Gamma'$ iff $\rho$ is a fixed point of the
mapping $B'_1$ sending $\rho$ to the marginal $B'(\rho)_1$. By
Lemma 5, we have a compression $P$ onto $\Gamma'$. Notice
that $P^* : A(\Gamma') \to A(\Omega)$ is a positive linear injection,
with $P^*(u_{\Gamma'}) = u$ (since $P^*(u_{\Gamma'})(\omega) = u_{\Gamma'}(P(\omega)) = 1$,
as $P(\omega) \in \Gamma'$). By Lemma 6, we also have a
mapping $Q : \Gamma' \to \Gamma' \otimes \Gamma'$ given by

$$Q(\rho) = (P \otimes P)(B'(\rho)).$$

The claim is that this is universally broadcasting on
$\Gamma'$. For if $\rho \in \Gamma'$, we have, for all $a \in [0, u_{\Gamma'}]$,

$$
\begin{aligned}
Q_1(\rho)(a) &= Q(\rho)(a \otimes u_{\Gamma'}) \\
&= \big((P \otimes P)B'(\rho)\big)(a \otimes u_{\Gamma'}) \\
&= B'(\rho)(P^*a \otimes P^*u_{\Gamma'}) \\
&= B'_1(\rho)(P^*a) = \rho(P^*a) \\
&= P(\rho)(a) = \rho(a)
\end{aligned}
$$

(using, in the last step, the fact that $P(\rho) = \rho$, since
$\rho \in \Gamma'$). It follows that $Q_1(\rho) = \rho$; in the same
way, one has that $Q_2(\rho) = \rho$. Since $Q$ is universally
broadcasting on $\Gamma'$, it must in particular broadcast
every extreme state $\alpha \in \Gamma'$. But then Lemma 3
implies that $Q(\alpha)$, being a state in $\Gamma' \otimes \Gamma'$ with
extreme marginals, must be a product state, namely,
$\alpha \otimes \alpha$. Thus, $Q$ is (jointly) cloning for all of the
extreme points of $\Gamma'$. It follows now from Theorem 1
that these extreme points are distinguishable in $\Gamma'$ -
hence, also in $\Omega$ (since any observable on $\Gamma'$ lifts to
one on $\Omega$). □

We now have a quantum no-broadcasting theorem 
as an easy 

Corollary 2 Let $\Gamma$ be a set of density operators on a
Hilbert space $H$. Suppose that there exists a positive
map $\phi : \mathcal{B}(H) \to \mathcal{B}(H \otimes H)$ broadcasting each $\rho \in \Gamma$.
Then the operators in $\Gamma$ are pairwise commuting.

Proof: By Theorem 2, $\Gamma$ is contained in a simplex
generated by distinguishable - hence, by Lemma 4,
commuting - density operators. It follows that the
operators in $\Gamma$ also commute. □

Remarks: 

(1) The standard quantum no-broadcasting the- 
orem applies to a completely positive broadcasting 
map. Our result gives, in the form of the above 
Corollary, a stronger formulation: that no positive 
map between matrix algebras can broadcast two non- 
commuting states. 



(2) As stated, Theorem 2 tells us little about the
convex structure of the set $\Gamma$ of states broadcast by
a map $B$ (since any convex set can be embedded in a
simplex). Combining it with the simple observation
made above near Definition 5 that $\Gamma$ is $\Omega$-affine, we
can say more: that $\Gamma$ is an affine section of a simplex
generated by distinguishable states. Our next result
is that $\Gamma$ in fact is a simplex generated by distinguishable
states.

Theorem 3 Let $\Gamma$ be the set of states broadcast by
an affine mapping $B : \Omega \to \Omega \otimes \Omega$. Then $\Gamma$ is a
simplex generated by jointly distinguishable states in
$\Omega$.

Proof: We maintain the definitions used in the proof
of Theorem 2. Any state $\omega \in \Gamma'$ has a unique representation
$\omega = \sum_i \omega_i \alpha_i$ as a convex combination of
the extremal points $\alpha_i$ of the simplex $\Gamma'$. Let $[a'_j]_{j=0}^{n}$
be a measurement that distinguishes the vertices of
$\Gamma'$. The $a'_0$ outcome has probability 0 on all states
in $\Gamma'$, so we may set $a_1 = a'_0 + a'_1$ and $a_i = a'_i$ for
$2 \leq i \leq n$ to obtain an observable $[a_i]_{i=1}^{n}$ that still
satisfies $a_i(\alpha_j) = \delta_{ij}$. This observable can be used to
define a restriction map $r : \Omega \to \Gamma'$ via

$$r(\omega) = \sum_{i=1}^{n} \omega(a_i)\, \alpha_i, \tag{3}$$

which is affine and surjective. For any $\omega \in \Omega$, this
induces a unique "reduced state" $\omega^r \in \Gamma'$ defined as
$\omega^r = r(\omega)$. All these "reduced states" $\omega^r$ are determined
uniquely by an $n$-vector $v^\omega$ of probabilities,
with components $v^\omega_i = \omega(a_i)$.

Any state $\omega \in \Gamma$ satisfies $(B_m(\omega))^r = B_m(\omega) = \omega$
for $m = 1, 2$. Therefore $B_m(\omega) = (B_m(\omega))^r =
(\sum_i \omega_i B_m(\alpha_i))^r = \sum_i \omega_i (B_m(\alpha_i))^r$. Since
$(B_m(\alpha_i))^r \in \Gamma'$, the restriction to $\Gamma'$ of the
map $\omega \mapsto (B_m(\omega))^r$ is a classical stochastic map
on the simplex $\Gamma'$. This map can be represented
as a column stochastic matrix $M_m$ that acts on
the vector $v^\omega$. The $i$th column of $M_m$ is just the
vector representative of the image of the vertex
$\alpha_i$ under the map $B_m$, i.e. $v^{B_m(\alpha_i)}$. Thus a state
$\omega \in \Gamma'$ is broadcastable if and only if $M_m v^\omega = v^\omega$
for $m = 1, 2$, that is, if $v^\omega$ is in the intersection of
the fixed-point subspaces of both stochastic matrices
$M_m$. We can understand these fixed point spaces
using the extension of the Perron-Frobenius theory
of eigenvectors and eigenvalues of irreducible nonnegative
square matrices to the case of general (i.e.
possibly reducible) nonnegative square matrices.
Appendix A summarizes this theory and proves
two Lemmas we use. Lemma 7, following easily
from the extended Perron-Frobenius theory, gives
a basis for the space of fixed points of a stochastic
map consisting of disjointly supported nonnegative
vectors, which correspond to distinguishable states
when normalized. The main technical work of the
present proof is in deriving from this Lemma 8,
stating that the intersection of the fixed-point spaces
of two such stochastic matrices also has (when it is
not $\{0\}$) a basis of disjointly supported nonnegative
vectors, so that the set of normalized states that
are fixed points of both maps is the simplex $\Delta(\{v^{[I]}\})$
generated by these distinguishable states. Since we
established above that $\Gamma$ is the set of states fixed by
two stochastic maps, it is a simplex generated by
distinguishable states. (If the intersection is $\{0\}$ (as
it will be for a generic map $B$), then $\Gamma = \Delta(\emptyset) = \emptyset$,
which we view as a degenerate case of a simplex
generated by a set of distinguishable states.) □
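
The reduction to two stochastic matrices lends itself to a direct numerical check. The sketch below assumes the two marginal maps are already given as column-stochastic matrices `M1` and `M2` (hypothetical examples, not taken from the paper) and computes the joint fixed-point subspace as a null space via the singular value decomposition.

```python
import numpy as np

def joint_fixed_space(M1, M2, tol=1e-10):
    """Orthonormal basis (as columns) of {x : M1 x = x and M2 x = x}."""
    n = M1.shape[0]
    stacked = np.vstack([M1 - np.eye(n), M2 - np.eye(n)])
    _, s, vh = np.linalg.svd(stacked)
    rank = int(np.sum(s > tol))
    return vh[rank:].T          # right singular vectors with zero singular value

# M1 mixes vertices 2 and 3; M2 swaps vertices 0 and 1.
M1 = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 0.5, 0.5],
               [0.0, 0.0, 0.5, 0.5]])
M2 = np.array([[0.0, 1.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]])

basis = joint_fixed_space(M1, M2)
print(basis.shape[1])           # dimension of the joint fixed-point space

# Candidate simplex vertices with disjoint supports.
x = np.array([0.5, 0.5, 0.0, 0.0])
y = np.array([0.0, 0.0, 0.5, 0.5])
print(np.allclose(M1 @ x, x), np.allclose(M2 @ x, x))
print(np.allclose(M1 @ y, y), np.allclose(M2 @ y, y))
```

In this example the joint fixed-point space is two-dimensional and is spanned by the two disjointly supported probability vectors `x` and `y`, so the normalized joint fixed points form a one-dimensional simplex with distinguishable vertices.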

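Remark: Although for a given $B$ both $\Gamma$ and $\Gamma'$ are
simplices generated by distinguishable states, it is
easily shown by example that $\Gamma$ may be a proper subset
of $\Gamma'$. For instance, let $\Omega = \Delta(\{\alpha_1, \alpha_2\})$ and let
$B : \alpha_1 \mapsto \alpha_1 \otimes \alpha_2$, $\alpha_2 \mapsto \alpha_1 \otimes \alpha_2$. Then every state
has first marginal $\alpha_1$ and second marginal $\alpha_2$ under $B$, so $\Gamma = \emptyset$, while
$\Gamma' = \{(\alpha_1 + \alpha_2)/2\}$.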

6 Conclusions 

In order to understand the nature of information 
processing in quantum mechanics, it is important 
to be able to delineate clearly those probabilistic 
and information-theoretic phenomena that are in- 
deed essentially quantum, from those that are more 
generically non-classical. We have established here 
that several specific features of quantum information 
are generic: entanglement monogamy, and, in finite- 
dimensional theories, the connection between cloning
and state-discrimination and the no-broadcasting
theorem.

One might wonder at this point whether every 
qualitative result of quantum information will turn 
out to be similarly generic, either in non-classical the- 
ories or in all theories. This is not the case, however. 
For example, not every finite-dimensional probabilis- 
tic theory allows for teleportation (this is shown in 
[9] and also follows from the results of [48] on entan- 
glement swapping.) 

Finally, it is worth commenting on the program 
of deriving quantum theory from information theo- 
retic axioms [23, 24, 17] in the light of the present 
work. Any such attempt must begin with a frame- 
work that delineates the set of theories under consid- 
eration. The framework must be narrow enough to 
allow the axioms to be succinctly expressed mathe- 
matically, but broad enough that the main substan-
tive assumptions are contained in the axioms rather
than in the framework itself. The generalized prob- 
ability models discussed in this paper would appear 
to be a natural choice for this task. 

In [17], Clifton, Bub and Halvorson attempt an 
information theoretic axiomatization within a C*- 
algebraic framework, which is narrower than the 
framework adopted here. In fact, the C* framework 
is already very close to quantum theory, in the sense 
that all theories in the framework have Hilbert space 
representations. In the finite dimensional case, quantum
theory, classical probability and quantum theory
with superselection rules are the only options
available. The information theoretic axioms used in
[17] are: no-signaling, no-broadcasting and no-bit-commitment.
From these it is shown that there must
be noncommuting observables in the theory and there 
must be some entangled states. Given the restricted 
nature of the C* framework, this already yields a the- 
ory that looks quite close to quantum theory. 

In contrast, the generalized probabilistic frame- 
work adopted here automatically satisfies no- 
signaling, and we have shown that no-broadcasting 
is generically true of any nonclassical model. Such 
generic models can look very different from quantum 
theory. For example, they include models that sup- 
port super-quantum correlations. An open question 
is whether no-bit-commitment is also generic in the
present framework, and it is possible that it does
place nontrivial constraints on the choice of tensor 
product. Nevertheless, it seems unlikely that these 
three axioms alone would get one particularly close 
to quantum theory. In the light of this, it seems that 
the best hope for future progress in axiomatization 
would be to supplement or replace these axioms with 
things that do not appear to be generic, such as the 
existence of a teleportation protocol.

A Perron-Frobenius Theory, Fixed Points of Classical Stochastic Maps, and Lemmas Used in Proving Theorem 3

By a nonnegative matrix (or row or column vector)
we mean one with real nonnegative entries. By a
semipositive matrix or vector, we mean one with nonnegative
entries at least one of which is positive, and
by a positive matrix or vector, we mean one for which
every entry is strictly positive. A nonnegative matrix
$M$ is called reducible if there exists a permutation matrix
$P$ such that $PMP^{T}$ has the form

$$\begin{pmatrix} A & 0 \\ B & C \end{pmatrix}$$

with $A$ and $C$ square, and irreducible if there does not. Some such permutation
$P$ will put a general nonnegative square matrix $M$ in
Frobenius normal form

$$
\begin{pmatrix}
M^{11} & 0 & \cdots & 0 \\
M^{21} & M^{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
M^{K1} & M^{K2} & \cdots & M^{KK}
\end{pmatrix}
$$

where each diagonal block $M^{II}$, $I \in \{1, \ldots, K\}$, is irreducible.

The standard Perron-Frobenius theory applies to
irreducible nonnegative square matrices $M$, guaranteeing
a strictly positive eigenvector with a real positive
eigenvalue $\rho(M)$ greater than or equal to the
modulus of any other eigenvalue, real or complex
(thus $\rho(M)$ is the spectral radius of $M$).

A result explicitly stated and proved in [18], and
also stated in [46] (where its proof is said to be essentially
present in Frobenius [22]) partially characterizes
the real nonnegative eigenvectors of general
(possibly reducible) nonnegative square matrices that
correspond to positive eigenvalues. The eigenvalues
of such nonnegative eigenvectors are $\rho_I := \rho(M^{II})$,
and for each diagonal block $M^{II}$ in the Frobenius
normal form of $M$ having a given $\rho_I$, there is an
eigenvector whose components with indices (after
the permutation that gives Frobenius normal form) in
block $I$ and above are nonnegative, and whose lower-indexed
components are zero. It is also possible to
characterize the eigenvectors in a way which is independent
of Frobenius normal form by introducing
the following terminology. An index $i$ has access to
an index $j$ if there is some finite power $p$ such that
$(M^p)_{ij} > 0$. In the context of column-stochastic matrices
interpreted as transition matrices, this means
that probability can eventually leak from state $j$ to
state $i$ (note the directionality, which is not obvious
from the term "has access to"). Equivalently, $i$ has
access to $j$ if in the directed "transition graph" having
edges $(j,i)$ (thought of as directed "from $i$ to
$j$") where, and only where, $M_{ij} \neq 0$ (note again the
nonintuitive directionality opposite the flow of probability),
there is a (directed) path from $i$ to $j$. The
indices in a given subset $I$, on which an eigenvector
has positive components, can be characterized
as mutually having access to each other (a condition
which identifies those subsets without the need to
mention Frobenius normal form as we did above).
Finally, the eigenvectors with a given real positive
eigenvalue $\lambda$ are precisely the real semipositive linear
combinations of the eigenvectors, among those whose
existence is asserted above, having eigenvalue $\lambda$.

The next result concerns the fixed point states,
that is to say the real nonnegative normalized eigenvectors
$v$ ($\sum_i v_i = 1$) with eigenvalue 1 of the
column-stochastic matrix $M$.

Lemma 7 A column-stochastic matrix $M$ may be
put into Frobenius normal form in such a way that each
of its fixed point states is supported precisely on one
of the $L \leq K$ blocks numbered $K - L + 1, \ldots, K$. The
restriction of $M$ to these blocks will then be block-diagonal.

Proof: Without loss of generality suppose $M$ is in
Frobenius normal form, with blocks $M^{IJ}$, $I, J \in
\{1, \ldots, K\}$.

The real positive eigenvalues of a column-stochastic
matrix must be equal to 1 (because it preserves
normalization). Those of an irreducible properly
substochastic matrix (i.e. one for which all column
sums are at most 1, and at least one strictly
less) must be strictly less than 1.

$M^{KK}$ is column-stochastic, so it follows easily from
the irreducible Perron-Frobenius theory that $\rho_K = 1$
and there is an eigenvector whose support is $K$ with
eigenvalue 1. For any other diagonal block $M^{LL}$ to
have an eigenvalue-1 nonnegative eigenvector, it must
be the case that all blocks $M^{ML}$ below it ($M > L$)
are zero matrices, for if one of them is not, then $M^{LL}$
is properly column-substochastic. Any such diagonal
blocks $M^{LL}$ with $\rho(M^{LL}) = 1$ can be put at the
end of the ordering of blocks (indeed, in arbitrary
order at the end) by an index permutation preserving
Frobenius normal form. Assume this has been
done, and let them be blocks $K - L + 1$ through $K$.
Thus by the Cooper/Frobenius result discussed above
Lemma 7, $M$ has $L$ disjointly supported fixed-point
eigenvectors, one supported on each of the subsets
$K - L + 1, \ldots, K$. The indices belonging to $1, \ldots, K - L$
thus correspond to vertices on which the fixed points
of $M$ have zero support. □
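
Lemma 7 can be illustrated without constructing the Frobenius normal form explicitly. The following sketch (function names are hypothetical) finds the closed classes of mutually accessible indices of a column-stochastic matrix and returns one normalized fixed-point vector per class; it uses a reachability closure in place of a block permutation, which is a substitute technique rather than the argument in the proof.

```python
import numpy as np

def recurrent_classes(M, tol=1e-12):
    """Closed classes of mutually accessible indices of a column-stochastic M.

    M[i, j] is the probability of moving from j to i.
    """
    n = M.shape[0]
    reach = (M > tol) | np.eye(n, dtype=bool)   # reach[i, j]: j feeds i in <= 1 step
    for _ in range(n):                           # transitive closure
        r = reach.astype(int)
        reach = reach | ((r @ r) > 0)
    classes, seen = [], set()
    for j in range(n):
        if j in seen:
            continue
        downstream = set(int(i) for i in np.flatnonzero(reach[:, j]))
        # j is recurrent iff everything it feeds can feed it back
        if all(reach[j, i] for i in downstream):
            classes.append(sorted(downstream))
            seen |= downstream
    return classes

def fixed_point_basis(M):
    """One normalized fixed-point vector per closed class, disjointly supported."""
    basis = []
    for cls in recurrent_classes(M):
        block = M[np.ix_(cls, cls)]
        vals, vecs = np.linalg.eig(block)
        v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
        full = np.zeros(M.shape[0])
        full[cls] = v / v.sum()                  # fixes sign and normalization
        basis.append(full)
    return basis

# Vertex 0 leaks to vertices 1 and 2; vertex 1 is absorbing; 2 and 3 swap.
M = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
classes = recurrent_classes(M)
basis = fixed_point_basis(M)
print(classes)
print(basis)
```

The transient vertex 0 carries no weight in any fixed point, and the two fixed-point vectors have disjoint supports, as the lemma asserts.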

Lemma 8 Let $M_1, M_2$ be two column-stochastic matrices.
The intersection of their fixed-point subspaces
is spanned by a set of distinguishable states, so the
set of normalized states that are fixed points of both
maps is a simplex generated by distinguishable states.

Proof: 

We cannot necessarily put both $M_1$ and $M_2$ in
Frobenius normal form simultaneously. However, the
block indices in the Frobenius normal form of $M_m$
correspond (for each fixed $m$) to a partition of the
set of vertex indices into subsets.

Thus each map's fixed-point space is defined by a
partition $\Lambda_m$ of the vertices into a set $Z_m$ on which it
has no support, and sets, for which we use variables
$I, I', \ldots$ for $m = 1$, and $J, J', \ldots$ for $m = 2$, of vertices
each of which supports a strictly positive fixed-point
vector $v^I$ (resp. $w^J$), with components $v^I_k$ (resp. $w^J_k$).
Thus e.g. $v^I_k = 0$ whenever $k \notin I$. We will also define
vectors $v = \sum_I v^I$ with components $v_k$ ($w = \sum_J w^J$
with components $w_k$). We will use the notation $I(l)$
to mean the subset of the pertinent partition to which
the vertex-index $l$ belongs.



If $\omega$ is in the intersection of the fixed-point spaces
of $M_1, M_2$ then there exist nonnegative $\lambda_I, \mu_J$ such that

$$\omega = \sum_I \lambda_I v^I = \sum_J \mu_J w^J. \tag{6}$$

The first way of expressing $\omega$ enforces that it is a
fixed point of $M_1$, the second, that it is a fixed point
of $M_2$. We now give a procedure for expressing the
condition $\omega = \sum_J \mu_J w^J$ as further constraints on the
$\lambda_I$'s taking the form that for some of the $I$, $\lambda_I$ must
be zero, while some of the ratios $\lambda_I/\lambda_{I'}$ are fixed by
the data $\mu_J, w^J$ when $I, I'$ are both incident on the
same $J$.

To do this, it will be useful to define some relations
$R, S$ on $\Lambda := \Lambda_1 \cup \Lambda_2$. We say $G \,R\, H$ iff $G \cap H \neq \emptyset$.
$R$ is reflexive and symmetric. Let $S$ be its transitive
closure (i.e. $G \,S\, H$ iff there is a finite chain $H_1, \ldots, H_n$
such that $G \,R\, H_1 \,R\, H_2 \,R \cdots R\, H_n \,R\, H$). $S$
is an equivalence relation, so its equivalence classes
$[I]_S, [J]_S$ partition $\Lambda$. Moreover, it is easy to see that
its restrictions $S_1, S_2$ to $\Lambda_1, \Lambda_2$ are also equivalence
relations, and for any given equivalence class $[I]_S$ or
$[J]_S$ of $S$, the equivalence classes $[I]_{S_1}$ or $[J]_{S_2}$ satisfy
$\bigcup [I]_{S_1} = \bigcup [I]_S$ (or $\bigcup [J]_{S_2} = \bigcup [J]_S$), i.e. the
sets in them contain the same vertices.
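
The equivalence classes of $S$ can be computed mechanically. The sketch below is one possible implementation using a union-find structure over the blocks of the two partitions; the function name and the example partitions are illustrative assumptions, not part of the paper.

```python
def s_equivalence_classes(blocks1, blocks2):
    """Group the blocks of two partitions into classes of the transitive
    closure S of the overlap relation R (G R H iff G and H share a vertex)."""
    blocks = [frozenset(b) for b in list(blocks1) + list(blocks2)]
    parent = list(range(len(blocks)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a in range(len(blocks)):
        for b in range(a + 1, len(blocks)):
            if blocks[a] & blocks[b]:       # a R b: merge their classes
                parent[find(a)] = find(b)

    classes = {}
    for idx, block in enumerate(blocks):
        classes.setdefault(find(idx), set()).add(block)
    return list(classes.values())

# Hypothetical supports for the fixed points of M_1 and M_2 on five vertices.
lambda_1 = [{0, 1}, {2}, {3, 4}]
lambda_2 = [{1, 2}, {3}, {4}]

classes = s_equivalence_classes(lambda_1, lambda_2)
vertex_sets = sorted(sorted(set().union(*cls)) for cls in classes)
print(vertex_sets)
```

Each class collects blocks from both partitions that are chained together by overlaps, and the union of the blocks in a class gives the vertex set on which one vertex of the joint fixed-point simplex can be supported.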

A fact that will be useful below is that if $\lambda_I = 0$
(or $\mu_J = 0$), then $\lambda_{I'} = \mu_{J'} = 0$ for all $I', J' \in [I]_S$
(or $[J]_S$). The reason is that $\lambda_I = 0$ implies $\omega_k = 0$
for all $k \in I$, so for such $k$, $\mu_J w^J_k = 0$, implying
(since $w^J_k > 0$ for $k \in J$) $\mu_J = 0$. In other words, $\lambda_I = 0$ and
$I \,R\, J$ imply $\mu_J = 0$; the same argument shows that
$\mu_J = 0$ and $J \,R\, I'$ implies $\lambda_{I'} = 0$; thus the same
statements hold with $S$ in place of $R$ and we see that
zero coefficients for $I$ (or $J$) propagate throughout
$[I]_S$ (or $[J]_S$).

Note that if $I \cap Z_2 \neq \emptyset$ then $\lambda_{I'} = 0$, $\mu_J = 0$ for
$I', J \in [I]_S$. This is because the vectors $v^I$ have positive
components $v^I_k$ for $k \in I$, but for $k \in Z_2$ we have
$\omega_k = 0$, which therefore requires $\lambda_I = 0$; the above
observation then applies.

Let $Z'$ be $Z_1$ plus the set of all the vertices that
this argument shows to have $\omega_k = 0$, and $\Lambda'_m$ the
partitions of the remainder of the vertices agreeing
with $\Lambda_m$.

Recall from (6) that the components of $\omega$ satisfy:

$$\omega_k = \lambda_{I(k)} v_k = \mu_{J(k)} w_k \tag{7}$$

for all $k$. Thus if $I \cap J \neq \emptyset$ then either $\mu_J, \lambda_I = 0$ or
$v_k/w_k$ for $k \in I \cap J$ is some constant $\beta_{IJ} := \mu_J/\lambda_I$
independent of $k$. So $\lambda_I$ must be zero if there is any
$J$ with $J \cap I \neq \emptyset$ for which $v$ is not proportional
to $w$ on $J \cap I$. As before, the upshot is that if an
equivalence class $X$ of $S$ contains sets $J, I$ such that
$v$ is not proportional to $w$ on $I \cap J$, the coefficients of
all sets in $X$ must be zero. This constraint removes
more indices from the subset on which joint fixed-points
can be supported (implying the fixed-points lie
in the subsimplex of $\Gamma'$ with those vertices deleted).
We therefore define $Z''$, $\Lambda''_m$ similarly to $Z'$, $\Lambda'_m$. Now
let $J(k) = J(l) = J$ but $I(k) = I \neq I' = I(l)$, and
consider

$$\lambda_{I(k)} \frac{v_k}{w_k} = \mu_J = \lambda_{I(l)} \frac{v_l}{w_l}. \tag{8}$$

If $\lambda_I \neq 0$ then $\mu_J, \lambda_{I'} \neq 0$ and we get the requirement:

$$\frac{\lambda_{I(l)}}{\lambda_{I(k)}} = \frac{v_k\, w_l}{w_k\, v_l}, \tag{9}$$

i.e.

$$\frac{\lambda_{I'}}{\lambda_I} = \frac{\beta_{IJ}}{\beta_{I'J}}. \tag{10}$$

As promised, some of the constraints coming from
$M_2$ have fixed the ratio of $\lambda_I$ and $\lambda_{I'}$. Any $J' \neq J$
incident on both $I$ and $I'$ must give rise to the same
ratio $\lambda_{I'}/\lambda_I$; that is,

$$\beta_{IJ}/\beta_{I'J} = \beta_{IJ'}/\beta_{I'J'}. \tag{11}$$

Should this not be the case, our assumption that $\lambda_I \neq 0$
must be false, so all $\lambda_{I''} = 0$ for $I'' \in [I]_{S_1}$.

Thus, the ratios $\lambda_{I'}/\lambda_I$ are fixed to $\gamma_{I'I} :=
\beta_{IJ}/\beta_{I'J}$ within those $S$-equivalence classes for which
the RHS is independent of $J$, while all $\lambda_I = 0$ in
the other $S$-equivalence classes. No constraints on
the $\lambda_I$ arise across $S$-equivalence classes. We add the
zeroed-out vertices to $Z''$ to obtain $Z'''$, and similarly
obtain $\Lambda'''$ as the remaining $S_1$-equivalence classes.

Some obvious consistency conditions must be satisfied
by the ratios $\gamma_{II'} = \lambda_I/\lambda_{I'}$ thus obtained, namely
the transitivity conditions:

$$\gamma_{II'}\, \gamma_{I'I''} = \gamma_{II''}. \tag{12}$$

It may be the case that one side of this is defined while
the other side is not, because, for example, although
some $J$ is incident on both $I$ and $I'$, no $J$ is incident
on both $I$ and $I''$, in which case no further constraint
arises; but when all are defined, we have (recalling
the definition of $\beta_{IJ}$, and writing $k \in I$, $k' \in I'$, $k'' \in I''$
for representative vertices) that:

$$\frac{v_{k'}/w_{k'}}{v_k/w_k} \cdot \frac{v_{k''}/w_{k''}}{v_{k'}/w_{k'}} = \frac{v_{k''}/w_{k''}}{v_k/w_k}. \tag{13}$$

Canceling, we obtain an identity so no further constraints
arise.

We have just expressed all the constraints arising
from $\omega = \sum_J \mu_J w^J$ as constraints on the $\lambda_I$.
The other constraint $\omega = \sum_I \lambda_I v^I$ gives $\omega$ as a
convex combination of distinguishable states, i.e.
$\omega \in \Delta(\{v^I\})$. It is evident that fixing $\omega_k = 0$ for
$k \in Z'''$ just says the states are in a subsimplex of
$\Delta(\{v^I\})$, while fixing the ratios of vertices within
the elements of a partition just says that the states
are convex combinations of a particular set of disjointly
supported, and therefore still distinguishable,
states in this subsimplex.

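To be rigorous we give an explicit expression for
$\omega$ as a convex combination of distinguishable states.
Without loss of generality suppose that $\sum_k v^I_k =
\sum_k w^J_k = \sum_I \lambda_I = \sum_J \mu_J = 1$, so that $\omega$, $v^I$, and
$w^J$ are normalized states. Picking representatives $\tilde{I} \in [I]$
from each element $[I]$ of the partition $\Lambda'''$ we begin
with $\omega = \sum_{I \in \bigcup \Lambda'''} \lambda_I v^I$ and impose the constraints,
getting:

$$\omega = \sum_{[I]} \lambda_{\tilde{I}} \sum_{I' \in [I]} (\lambda_{I'}/\lambda_{\tilde{I}})\, v^{I'} = \sum_{[I]} \lambda_{\tilde{I}} \sum_{I' \in [I]} \gamma_{I'\tilde{I}}\, v^{I'}. \tag{14}$$

Define normalized vectors

$$v^{[I]} := \Big( \sum_{I' \in [I]} \gamma_{I'\tilde{I}}\, v^{I'} \Big) \Big/ \Big( \sum_{I' \in [I]} \gamma_{I'\tilde{I}} \Big) \tag{15}$$

and scalars

$$\lambda_{[I]} := \lambda_{\tilde{I}} \Big( \sum_{I' \in [I]} \gamma_{I'\tilde{I}} \Big). \tag{16}$$

To see that these definitions are independent of the
choice of representative $\tilde{I}$ of $[I]$, recall (cf. (9)) that

$$\gamma_{I'\tilde{I}} := \frac{\lambda_{I'}}{\lambda_{\tilde{I}}} = \frac{v_p\, w_l}{w_p\, v_l}, \tag{17}$$

for any $p \in \tilde{I}$, $l \in I'$ (independently of our choice of
such $p, l$). Now from (15),

$$v^{[I]}_k = \frac{\gamma_{I(k)\tilde{I}}\, v_k}{\sum_{I' \in [I]} \sum_{k' \in I'} \gamma_{I'\tilde{I}}\, v_{k'}} = \frac{\gamma_{I(k)\tilde{I}}\, v_k}{\sum_{I' \in [I]} \gamma_{I'\tilde{I}}} \tag{18}$$

and we see that the $\tilde{I}$ dependence, which is only
through the factor $\gamma_{\cdot\,\tilde{I}}$ on top and on the bottom,
takes the form of factors $v_p/w_p$ for some $p \in \tilde{I}$
on the top and bottom, which cancel, establishing
the claimed independence from the choice of $\tilde{I} \in [I]$.
Also,

$$\lambda_{[I]} = \sum_{I' \in [I]} \lambda_{I'}, \tag{19}$$

showing that this too depends only on $[I]$.
With these definitions, (14) becomes:

$$\omega = \sum_{[I]} \lambda_{[I]}\, v^{[I]}. \tag{20}$$

Since the sets of vertices $\bigcup [I]$ supporting each
$v^{[I]}$ are disjoint, the $v^{[I]}$ are distinguishable, and since
in addition the nonnegative coefficients $\lambda_{[I]}$ are free
except for overall normalization, $\Gamma$ is the simplex
$\Delta(\{v^{[I]}\}_{[I]})$ with distinguishable vertices $v^{[I]}$. □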
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